The Stan Math Library: Reverse-Mode Automatic Differentiation in C++
Abstract
As computational challenges in optimization and statistical inference grow ever harder, algorithms that use derivatives are becoming increasingly important. Implementing the derivatives that make these algorithms so powerful, however, is a substantial burden on users, and the practicality of these algorithms depends critically on tools such as automatic differentiation that remove the implementation burden entirely. The Stan Math Library is a C++ reverse-mode automatic differentiation library designed to be usable, extensive and extensible, efficient, scalable, stable, portable, and redistributable, in order to facilitate the construction and use of such algorithms. Usability is achieved through a simple direct interface and a cleanly abstracted functional interface. The extensive built-in library includes functions for matrix operations, linear algebra, differential equation solving, and most common probability functions. Extensibility derives from a straightforward object-oriented framework for expressions, allowing users to easily create custom functions. Efficiency is achieved through a combination of custom memory management, subexpression caching, traits-based metaprogramming, and expression templates. Partial derivatives for compound functions are evaluated lazily for improved scalability. Stability is achieved by taking care with arithmetic precision in algebraic expressions and by providing stable compound functions where possible. For portability, the library is written in standards-compliant C++ (C++03) and has been tested with all major compilers for Windows, Mac OS X, and Linux. It is distributed under the new BSD license. This paper provides an overview of the Stan Math Library's application programming interface (API), examples of its use, and a thorough explanation of how it is implemented. It also demonstrates the library's efficiency and scalability by comparing the speed and memory usage of its gradient calculations with those of several popular open-source C++ automatic differentiation systems (Adept, ADOL-C, CppAD, and Sacado), with results varying dramatically according to the type of expression being differentiated.

1. Reverse-Mode Automatic Differentiation

Many contemporary algorithms require the evaluation of a derivative of a given differentiable function $f$ at a given input value $(x_1, \ldots, x_N)$: for example, a gradient,

$$\left( \frac{\partial f}{\partial x_1}(x_1, \ldots, x_N), \; \cdots, \; \frac{\partial f}{\partial x_N}(x_1, \ldots, x_N) \right),$$

or a directional derivative,

$$\vec{v}(f)(x_1, \ldots, x_N) = \sum_{n=1}^{N} v_n \, \frac{\partial f}{\partial x_n}(x_1, \ldots, x_N).$$

Automatic differentiation computes these values automatically, using only a representation of $f$ as a computer program. For example, automatic differentiation can take a simple C++ expression such as x * y / 2 with inputs x = 6 and y = 4 and produce both the output value, 12, and the gradient, (2, 3).
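For instance, here is a minimal sketch of this example in terms of the library's reverse-mode scalar type, stan::math::var; the no-argument grad() call and the val() and adj() accessors used here follow the API described later in the paper:

    #include <stan/math.hpp>
    #include <iostream>

    int main() {
      using stan::math::var;

      var x = 6;          // independent variables are recorded as nodes
      var y = 4;          // of the expression graph as they are created
      var f = x * y / 2;  // extends the graph with nodes for * and /

      f.grad();           // reverse pass: propagate adjoints from f

      std::cout << "f     = " << f.val() << "\n";  // 12
      std::cout << "df/dx = " << x.adj() << "\n";  // y / 2 = 2
      std::cout << "df/dy = " << y.adj() << "\n";  // x / 2 = 3
      return 0;
    }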
Automatic differentiation is implemented in practice by transforming the subexpressions of the given computer program into nodes of an expression graph (see Figure 1, below, for an example) and then propagating chain-rule evaluations along those nodes (Griewank and Walther, 2008; Giles, 2008). In forward-mode automatic differentiation, each node $k$ of the graph carries both a value, $x_k$, and a tangent, $t_k$, representing the directional derivative of $x_k$ with respect to the input variables. The tangents of the input nodes are initialized to the components of $\vec{v}$, since those are exactly the directional derivatives of the inputs themselves. The complete set of tangents is then computed by propagating forward from the inputs to the outputs with the rule

$$t_i = \sum_{j \in \mathrm{children}[i]} \frac{\partial x_i}{\partial x_j} \, t_j.$$

As a special case, the derivative with respect to a single input variable is obtained by setting $\vec{v}$ to 1 for that distinguished variable and 0 for all others.
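To make the propagation rule concrete, here is a toy forward-mode sketch (not part of the Stan Math Library; the Dual type and its operators are hypothetical illustrations) that pushes the earlier x * y / 2 example through the rule with $\vec{v} = (1, 0)$, so the tangent of the result is $\partial f / \partial x$:

    #include <iostream>

    // A toy forward-mode node: each quantity carries its value x_k
    // and its tangent t_k, the directional derivative of x_k along v.
    struct Dual {
      double val;  // x_k
      double tan;  // t_k
    };

    // Each operation applies the propagation rule
    //   t_i = sum over operands j of (dx_i / dx_j) * t_j.
    Dual operator*(Dual a, Dual b) {
      return {a.val * b.val, b.val * a.tan + a.val * b.tan};
    }

    Dual operator/(Dual a, double c) {
      return {a.val / c, a.tan / c};
    }

    int main() {
      Dual x = {6, 1};  // v = (1, 0): tangent 1 for x and
      Dual y = {4, 0};  // 0 for y, so we recover df/dx
      Dual f = x * y / 2;
      std::cout << f.val << " " << f.tan << "\n";  // 12 2
      return 0;
    }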
Journal: CoRR
Volume: abs/1509.07164
Year: 2015